ABSTRACT

Embedded in the sequence of each transfer RNA are elements that promote specific interactions with its cognate aminoacyl tRNA-synthetase.
Although many such "identity elements" are known, their detection is difficult since they rely on unique structural 
signatures and the combinatorial action of multiple elements spread throughout the tRNA molecule. Since the anticodon is 
often a major identity determinant itself, it is possible to switch between certain tRNA functional types by means of anticodon 
substitutions. This has been shown to have occurred during the evolution of some genomes; however, the scale and relevance 
of "anticodon shifts" to the evolution of the tRNA multigene family is unclear. Using a synteny-conservation-based method, 
we detected tRNA anticodon shifts in groups of closely related species: five primates, 12 Drosophila, six nematodes, 11 
Saccharomycetes, and 61 Enterobacteriaceae. We found a total of 75 anticodon shifts: 31 involving switches of identity 
(alloacceptor shifts) and 44 between isoacceptors that code for the same amino acid (isoacceptor shifts). The relative numbers 
of shifts in each taxon suggest that tRNA gene redundancy is likely the driving factor, with greater constraint on changes of 
identity. Sites that frequently covary with alloacceptor shifts are located at the extreme ends of the molecule, in common with 
most known identity determinants. Isoacceptor shifts are associated with changes in the midsections of the tRNA sequence. 
However, the mutation patterns of anticodon shifts involving the same identities are often dissimilar, suggesting that alternate 
sets of mutations may achieve the same functional compensation. 
INTRODUCTION 

Transfer RNAs (tRNAs) are involved in two functionally related
but physically decoupled reactions that are integral to protein
synthesis (Söll and RajBhandary 1995). First, the amino 
acid identity of a tRNA molecule is determined by its recog- 
nition and aminoacylation by an aminoacyl tRNA-synthetase 
(aaRS) enzyme (Goodman and Rich 1962). Specific recogni- 
tion by a synthetase is facilitated by the tRNA's sequence and 
structural features (McClain 1993; Giege et al. 1998). The 
second reaction entails donation of the charged amino acid 
at the ribosome, following mRNA codon recognition by 
the tRNA anticodon (Ogle et al. 2001). Together these two 
reactions determine the sequence of protein. Translational 
accuracy is therefore dependent on the correspondence of a 
tRNA's identity and its anticodon and on their accurate re- 
flection of the genetic code. 

tRNAs are ~76 nt in length (Marck and Grosjean 2002). 
The number of residues in stem and loop regions is suffi- 
ciently conserved so that they can be referenced by a standard 
numbering system (Sprinzl et al. 1991). Length differences 
occur in the variable loop and the D-loop. In many eubacteria
and archaea and in virtually all eukaryotes, residues 74-76 
(nearly always CCA) are added post-transcriptionally and are 
not coded for by the tRNA genes (Deutscher and Ni 1982). 
The amino acid is attached to base A76 in the acylation reac- 
tion (Delagoutte et al. 2000; Xiong and Steitz 2004). Residue 
73 at the end of the acceptor stem is unpaired in all tRNAs 
except tRNA-His, which in eukaryotes contains a G residue 
post-transcriptionally added to the 5' terminus (position
−1). The anticodon residues are positions 34, 35, and 36. 

While all aaRSs catalyze the same type of reaction by the 
same mechanism, they share only a few common structural 
features. The synthetase family of enzymes can be divided 
into two distinct classes on the basis of the architecture of 
the catalytic domain (Eriani et al. 1990). Class I synthetases 
approach the tRNA acceptor stem from the minor groove 
and aminoacylate the 2' hydroxyl of the terminal A nucleotide, 
whereas class II synthetases approach from the major groove 
side and generally aminoacylate the 3' hydroxyl (Lapointe 
and Brakier-Gingras 2003). The aminoacylation reaction in- 
volves two transition states that are stabilized by the binding 
energy provided by aaRS-tRNA interactions (Loftfield and 
Vanderjagt 1972). 
The 20 aaRS enzymes must discriminate between a pool of 
structurally similar tRNAs. Therefore specific elements in 
tRNA sequence and structure must be present to allow syn- 
thetases to recognize and aminoacylate cognate tRNAs and 
avoid interaction with noncognate tRNAs. However, not all 
recognition elements (or "identity" elements) are obvious; 
many tRNAs that accept the same amino acid (isoacceptors) 
have different nucleotide sequences (Fender et al. 2004; 
Sprinzl and Vassilenko 2005), and even tRNAs with the 
same anticodon can have different recognition elements 
(Geslain and Pan 2010). To date, no complete set of identity 
elements is known for the entire tRNA complement of any 
organism. 

The sequence and structural features that promote recog- 
nition by specific aaRSs are termed identity determinants, 
while those that prevent indiscriminate interactions are 
termed antideterminants (Giege et al. 1998). Known tRNA 
identity elements are mostly located at the two opposite 
ends of the tRNA molecule: the anticodon itself, the discrim- 
inator base (unhybridized position 73), and the terminal 
4 bp of the acceptor stem. Position 37 in the anticodon 
loop is also involved in identities of tRNAs charged by class 
I synthetases (Giege et al. 1998). Identity elements in the 
tRNA core (nt 8-31 and 39-65) are more species-dependent 
and aaRS-tRNA system-dependent, are dispersed over nu- 
merous positions, and are individually less strong (Giege 
et al. 1998). 

Identity determinants have been investigated using a vari- 
ety of techniques for many years (Francklyn and Schimmel 
1989; McClain 1993; Giege et al. 1998). Typically these anal- 
yses focus on one tRNA functional class for one model or- 
ganism. Freyhult et al. (2007) point out that this approach 
can potentially reduce perspective on two important aspects 
of the identity problem: how the tRNA identity elements 
of different tRNA functional classes work together in the 
cell and how tRNA identity elements evolve among lineages 
for specific functional classes, and coevolve within lineages 
for different functional classes. It is known that some 
tRNAs have different identity rules in different taxa. For example,
Escherichia coli tRNA-Gly uses a crucial U73 discriminator
base in combination with a C2:G71 pair, while 
in mammals the discriminator base is A73 (Shiba et al. 
1994; Hipps et al. 1995); prokaryotic tRNA-Asp uses a 
U73 and the first base pair G1:C72 of the stem, while eu- 
karyotes use A73 and the third base pair of the stem 
(Nameki et al. 1992, 1995). Although crucial to our under- 
standing of the evolution of the genetic code (Schimmel 
et al. 1993), the evolution of tRNA identity elements has 
been the subject of few studies. The availability of numerous 
genome sequences and tRNA annotations allows the analysis 
of the tRNA identity system as a whole, and comparative 
genomics methods provide insight into tDNA sequence 
evolution. 

The tRNA multigene family is partitioned functionally according
to amino acid specificity such that each tRNA falls
into one of the 22 standard proteinogenic amino acid accepting
groups. Most functional groups comprise multiple isoacceptor
tRNAs. A reasonable model of evolution for this 
scenario would be that each group arose from a single com- 
mon ancestor with the corresponding amino acid identity. 
However, this classical model of evolution appears to be 
true for only some isoaccepting groups (Saks et al. 1998). 
The evolution of other tRNA groups requires more complex 
explanation. Phylogenetic analyses of Escherichia coli tRNAs 
have revealed that while some isoaccepting groups form dis- 
crete clusters, the isoacceptors of most groups are dispersed 
throughout the dendrogram (Saks et al. 1998). It has therefore 
been suggested that some isoacceptors may be derived from 
different ancestors (Cedergren et al. 1981; Fitch and Upper 
1987; Saks et al. 1998). 

Several experiments have demonstrated a simple and plau- 
sible mechanism for a "gene recruitment" process, whereby a 
tRNA sequence is recruited to a different amino acid identity. 
It is possible in vitro to alter a tRNA's amino acid charging 
identity by anticodon mutations (Schulman and Pelka 
1989; Pallanck and Schulman 1991). It has also been shown 
that mutations outside the anticodon, for example, involving 
acceptor stem nucleotides that are strong identity determi- 
nants, can also effectively switch the identity of a tRNA 
(Hipps et al. 1995). That "identity switches" could potentially 
occur by single or a few mutations was first demonstrated 
in vivo by the rescue of an inviable tRNA-Thr(UGU) deletion 
mutant of E. coli by a mutant tRNA-Arg(UGU) gene (Saks 
et al. 1998). A handful of studies have since inferred putative 
instances of tRNA identity switching events during the evolu- 
tionary history of organisms and organelles, in mitochondrial 
genomes (Rawlings et al. 2003; Lavrov and Lang 2005; Wu 
et al. 2012), in vertebrates (Coughlin et al. 2009; Tang et al. 
2009), in Drosophila (Rogers et al. 2010), in nematodes 
(Hamashima et al. 2012), and in primates (Wang and 
Lavrov 2011). One recent study has implicated a tRNA-Arg 
(GCU) to ACU (normally the anticodon of tRNA-Trp) mu- 
tation in a patient with mitochondrial encephalomyopathy 
(Roos et al. 2013). 

By using a micro-synteny-based method that builds on our previous
work identifying tRNA orthologs in the
Drosophila genus, we have carried out an extensive search 
for instances of tRNA gene recruitment events. We searched 
for sets of tRNA orthologs in complete genome sequences 
of five primates, six nematode worms, 12 Drosophila flies, 
11 Saccharomycetes yeast, and 61 Enterobacteriaceae bacteria.
We identified instances of tRNA orthologs with different
anticodons (anticodon shifts), comprising both gene recruitment events
involving anticodons of different tRNA identities
(alloacceptor shifts) and anticodon mutations that remain associated
with the same amino acid (isoacceptor shifts). In order
to further study the structural mechanisms of both types 
of tRNA anticodon shift, we searched for mutations in other 
regions of the tRNA that covary with the mutations in the 
anticodon. 
RESULTS 
Mappings 

We predicted tRNA genes in the genomes of five groups of 
species: five primates, 12 flies (Drosophila), six nematode 
worms (five Caenorhabditis and one Pristionchus), 11 Saccharomycetes
yeast (10 Saccharomyces and one Candida), and 
61 bacteria from the family Enterobacteriaceae, each one 
from a different genus (for list of species, see Supplemental 
File 1). Among the five groups of organisms, there is great 
variety in the size of the genomes, the number of predicted 
tRNAs, and the evolutionary divergences involved (Table 1; 
Supplemental File 2). 

We used sequence similarity in genomic regions flanking 
the tRNAs to identify potential tRNA orthologs in different 
species. A tRNA may have multiple mappings to another ge- 
nome; this may mean that multiple orthologs of that gene in 
another species have arisen by duplication or could be due to 
spurious similarity matches in tRNA flanking regions caused 
by undetected and unmasked repeat sequences. For each tax- 
onomic group, we compiled a list of ortholog sets by applying 
single-linkage clustering to the tables of mappings. Table 1 
shows that the number of tRNAs with syntenic orthologs in 
other species varies considerably: >90% for the primates and 
the Drosophilids, 68% for the Saccharomycetes tRNAs, and 
only 44% for the nematode worm tRNAs. In general, high 
numbers of tRNAs mapped between genomes indicate that 
the tRNA complements are highly conserved between spe- 
cies. The low synteny conservation among the nematode 
tRNAs suggests either a high rate of tRNA gene turnover or 
a recent trend of tRNA gene proliferation. On the other 
hand, the yeast and bacterial tRNA complements could be 
considered as relatively stable, especially given the greater 
scales of evolutionary divergence within those groups. 
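
The single-linkage step described above can be illustrated with a short sketch. It is not the authors' pipeline: the gene names, the pairwise mapping format, and the function name single_linkage_ortholog_sets are hypothetical, and the sketch assumes only that each mapping links one tRNA gene to one putative ortholog in another genome.

```python
from collections import defaultdict


def single_linkage_ortholog_sets(mappings):
    """Group genes into ortholog sets: two genes share a set whenever a
    chain of pairwise mappings connects them (single-linkage clustering)."""
    parent = {}

    def find(gene):
        # Register unseen genes as their own root, then walk up to the root.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for gene_a, gene_b in mappings:
        root_a, root_b = find(gene_a), find(gene_b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(set)
    for gene in parent:
        clusters[find(gene)].add(gene)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical flank-similarity mappings between tRNA genes of three species.
example_mappings = [
    ("dmel_tRNA1", "dsim_tRNA7"),
    ("dsim_tRNA7", "dyak_tRNA3"),
    ("dmel_tRNA2", "dsim_tRNA9"),
]
ortholog_sets = single_linkage_ortholog_sets(example_mappings)
print(ortholog_sets)
```

Because a single chain of mappings is enough to merge two genes, a spurious flank match can join otherwise unrelated sets, which is consistent with the later filtering of low-confidence orthologs.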

Anticodon shifts 

We used our sets of predicted tRNA orthologs to identify 
potential changes in anticodon sequences. The criteria em- 
ployed here to identify so-called anticodon shifts are similar 
to those reported previously (Rogers et al. 2010). However,
in order to study the structural changes associated with
tRNA anticodon shifts, we have chosen to restrict ourselves
to a high-confidence set of potential shifts. To that end, we 
have focused on shifts that have occurred since the last com- 
mon ancestor of the clade and that have supporting evidence 
from more than one test (see Materials and Methods). These 
criteria are therefore stricter than those previously used by us 
and others (Rogers et al. 2010; Wang and Lavrov 2011). 

A total of 233 ortholog sets were found to contain tRNAs 
with different anticodon sequences (for details of the anticodon
shift-containing ortholog sets for each taxon, see Supplemental
File 3). After removing low-confidence orthologs 
with dissimilar sequences and removing tRNAs with unstable 
structures (see Materials and Methods), we obtained a fil- 
tered set of 75 predicted anticodon shifts: 31 alloacceptor 
shifts (involving a predicted change in tRNA identity) and 
44 isoacceptor shifts (a change between anticodons specifying 
the same identity). 

We considered a typical eukaryotic tRNA complement of 
46 tRNA anticodons representing 21 identities (the 20 
directly encoded identities and selenocysteine) (Marck and 
Grosjean 2002). Within this set, there are 414 possible sin- 
gle-base anticodon substitutions: 299 of these would produce 
an anticodon coupled to a codon specifying a different iden- 
tity, 101 the same identity, and 14 changes to anticodons 
complementary to STOP signals (suppressor mutations). 
Thus if the anticodon shifts occurred by chance with no con- 
straint, we would expect approximately three times as many 
alloacceptor shifts as isoacceptor shifts. However, we observe 
slightly more isoacceptor shifts, suggesting significantly 
greater constraint on alloacceptor shifts (P < 0.0001, χ² test). 
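
The arithmetic behind this comparison can be checked with a small goodness-of-fit sketch. The exact inputs to the reported test are not given in this section, so the sketch makes two assumptions: the observed counts are the 31 alloacceptor and 44 isoacceptor shifts, and the expected proportions are 299:101 (suppressor changes excluded). The function names are illustrative only.

```python
import math


def chi_square_gof(observed, expected_weights):
    """Chi-square goodness-of-fit statistic for counts against weights."""
    total = sum(observed)
    weight_sum = sum(expected_weights)
    expected = [total * w / weight_sum for w in expected_weights]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


def p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(stat / 2.0))


# 46 anticodons x 3 positions x 3 alternative bases per position.
possible = 46 * 3 * 3
assert possible == 299 + 101 + 14

observed = [31, 44]   # alloacceptor, isoacceptor shifts (assumed inputs)
weights = [299, 101]  # possible alloacceptor, isoacceptor substitutions
chi2, expected = chi_square_gof(observed, weights)
p = p_value_1df(chi2)
print(f"expected = {expected}, chi2 = {chi2:.2f}, p = {p:.2e}")
```

Under these assumptions, roughly 56 alloacceptor and 19 isoacceptor shifts would be expected by chance, so the observed excess of isoacceptor shifts is in line with the reported P < 0.0001.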

In total, there are 71 anticodon shifts involving single-base 
anticodon changes (43 isoacceptor shifts and 28 alloacceptor 
shifts) (Table 2), and four two-base changes (three of which 
are alloacceptor shifts). Substitutions of the middle anti- 
codon base (position 35) are most common in alloacceptor 
shifts, followed by the third anticodon position. Thirty- 
nine out of the 44 isoacceptor shifts are substitutions of 
the first anticodon base. The single isoacceptor shift involv- 
ing two anticodon base changes involves a switch between 
tRNA-Leu(AAG) and tRNA-Leu(UAA) in nematodes. 



TABLE 1. Total numbers of tRNA genes annotated (excluding sequences annotated as pseudogenes by tRNAscan-SE) and with putative syntenic orthologs in other species

                                                Primates  Drosophila  Nematodes  Saccharomycetes  Enterobacteriaceae
                                                (5)       (12)        (6)        (11)             (61)
LCA (mya)                                       45        62          191        417              900
Mean no. of tRNAs per genome                    522       289         911        243              73
Mean no. (%) of tRNAs with syntenic orthologs   495 (94)  265 (92)    401 (48)   164 (66)         59 (77)
No. of alloacceptor shifts detected             21        5           4          0                1
No. of isoacceptor shifts detected              10        12          11         8                3

Last common ancestor (LCA) divergence times were derived from TimeTree (Hedges et al. 2006). 



www.rnajournal.org 271 



Rogers and Griffiths-Jones 



TABLE 2. Observed frequencies of the seven possible combinations of anticodon mutation in alloacceptor and isoacceptor shifts

Pattern   All   Allo   Iso
S x x     45    6      39
S S x     2     2      0
S S S     0     0      0
x x S     13    9      4
x S S     1     1      0
S x S     1     0      1
x S x     13    13     0

The position of a substitution in the anticodon is indicated by an S, while absence of substitutions is indicated by an x. 



Visual inspection of the alignment suggests an insertion or
deletion of one base in the anticodon loop is responsible,
rather than two independent substitutions. The most frequent
substitution is between U and C (24 of 79) (Table 3),
which likely reflects the unique ability of first position anticodon
U bases to pair with both A and G bases in the third codon
position. Twenty-six percent of alloacceptor shifts and
39% of isoacceptor shifts involve switches between these bases.
The identity of the bases involved in alloacceptor shift mutations
appears to be much more evenly distributed than
in isoacceptor changes (Table 3). None of the shifts we detect
involve initiator tRNAs. 
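
The two classifications used above (the substitution pattern of Table 2, and alloacceptor versus isoacceptor) can be sketched as follows. This is a simplified illustration and not the authors' procedure: it decodes each anticodon by strict Watson-Crick pairing with the standard genetic code, so it ignores wobble pairing, base modifications, initiator tRNAs, and selenocysteine (whose TCA anticodon reads as a stop codon here). All function names are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order at each position.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def decoded_amino_acid(anticodon):
    """Amino acid (one-letter code) of the codon that pairs exactly with
    the anticodon; accepts DNA (T) or RNA (U) spelling."""
    codon = anticodon.upper().translate(COMPLEMENT)[::-1]
    return CODON_TABLE[codon]


def substitution_pattern(anticodon_a, anticodon_b):
    """Table 2 style pattern: S marks a substituted position, x an
    unchanged one (positions 34, 35, 36 from left to right)."""
    a = anticodon_a.upper().replace("U", "T")
    b = anticodon_b.upper().replace("U", "T")
    return "".join("S" if x != y else "x" for x, y in zip(a, b))


def classify_shift(anticodon_a, anticodon_b):
    same = decoded_amino_acid(anticodon_a) == decoded_amino_acid(anticodon_b)
    kind = "isoacceptor" if same else "alloacceptor"
    return substitution_pattern(anticodon_a, anticodon_b), kind


print(classify_shift("GTT", "TTT"))  # Asn versus Lys anticodons
print(classify_shift("AAG", "UAA"))  # the two Leu anticodons from the text
```

The tRNA-Leu(AAG)/tRNA-Leu(UAA) pair from the text differs at the first and third anticodon positions, which corresponds to the single "S x S" isoacceptor entry in Table 2.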

Table 1 shows the numbers of alloacceptor and isoacceptor 
anticodon shifts detected in each taxonomic group. The numbers
of anticodon shifts correlate with the average number 
of mapped tRNA genes present in that group (Pearson's r = 
0.88) and with the average genome size (Pearson's r = 0.89). 
Putative primate anticodon shifts were the most common, 
comprising 41% of the total detected, despite this group 
containing only five genomes and the shortest evolutionary 
divergence of the groups tested. In contrast, no alloacceptor shifts were
detected in yeast, and only one was detected in Enterobacteriaceae,
despite the numerous species in the latter data set 
(61) and their much higher evolutionary divergences. Despite 
the high divergence time, lack of orthologous tRNA identifica- 
tions is not responsible: Our synteny strategy identifies ortho- 
logs for 77% of tRNAs in these bacteria. Thus we suggest that 
the relative absence of anticodon shifts is related to the lower 
redundancy of tRNAs in these genomes. In Drosophila, nem- 
atode worms, Saccharomycetes, and Enterobacteriaceae, we 
found a greater number of isoacceptor shifts than alloaccep- 
tor shifts, at least twice as many in each taxonomic group. 
However, in primates this trend is reversed; we identified twice 
as many potential alloacceptor shifts as isoacceptor shifts. The 
reason behind this excess of alloacceptor shifts in primates is 
obscure. 
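
The correlation between shift counts and mapped tRNA genes can be recomputed from Table 1 with a brief sketch. The inputs are an assumption: the per-group totals are taken as alloacceptor plus isoacceptor shifts, and the mapped-gene means are the Table 1 values as extracted here. The result therefore need not reproduce the reported r = 0.88 exactly, since the authors' precise inputs are not stated in this section.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Order: primates, Drosophila, nematodes, Saccharomycetes, Enterobacteriaceae.
mapped_trnas = [495, 265, 401, 164, 59]          # mean tRNAs with orthologs
total_shifts = [21 + 10, 5 + 12, 4 + 11, 0 + 8, 1 + 3]
r = pearson_r(mapped_trnas, total_shifts)
print(f"Pearson's r = {r:.2f}")
```

With only five taxonomic groups, a correlation of this size should be read as descriptive rather than as strong statistical evidence.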

Among the 30 eukaryotic alloacceptor anticodon shifts, 17
involve a transition between anticodons that couple with different
aaRS classes, eight shifts switch between different
class I synthetases, and five between class II synthetases
(Table 4). The high number of synthetase "transition" shifts
is not attributable to clustering of the anticodons that each
synthetase class uses; by random single-base substitutions,
class I and class II synthetase-coupled anticodons have similar
likelihood of switching to an anticodon associated with
the same or different synthetase classes. (Averaged over all
identities, the numbers of distinct alloacceptor shifts possible
by random single-base substitution for the tRNAs of a given
class are as follows: class I→class I, 3.5; class I→class II, 3.8;
class II→class I, 4.2; and class II→class II, 3.8.) The observed
set of alloacceptor shifts includes 27 (33%) of the 81 different
identity pairs that are possible to achieve with single anticodon
base mutations. For 12 synthetase transition identity
shifts, we can predict the direction of the shift by parsimony:
Eight are from class II to class I synthetase identities, and four
are from class I to class II synthetase identities. 

We found isoacceptor shifts involving 14 out of 19 capable
tRNA identities (all except Asp, His, Ile, Met, and Tyr; note
that Trp and SeC identities have only one tRNA isoacceptor).
In contrast to the alloacceptor shifts, the isoacceptor 
shifts show no bias in the synthetase classes in which they oc- 
cur. Among the 44 isoacceptor switches, we observe 22 in 
class I synthetase identities and 22 in class II synthetase iden- 
tities (Table 4). There is a positive correlation between the 
number of anticodons in an isoacceptor family and the num- 
ber of shifts observed between those anticodons (P = 0.04, 
Pearson's correlation). 

Covarying mutations 

Among orthologous tRNA families, we find large variation in
the number of extra-anticodon mutations and in the positions
of these mutations. By using pair-entropy normalized 
mutual information (MI) as a measure of covariation with 
the anticodon, we quantified the covariance occurring at 
each site with respect to the anticodon, across all alignments 
that have anticodon shifts (see Materials and Methods; for 
details of calculated covariance in each anticodon shift align- 
ment, see Supplemental File 4). The summed MI score across 
all alignment positions for an anticodon shift is a measure of 
the number of mutations (including insertions and dele- 
tions) at bases other than the anticodon that covary with 



TABLE 3. Observed frequencies of the six possible combinations of base substitutions occurring in the anticodon during alloacceptor and isoacceptor shifts

Substitution   All   Alloacceptor shifts   Isoacceptor shifts
A↔C            8     5                     3
A↔G            17    8                     9
A↔U            13    4                     9
C↔G            8     5                     3
C↔U            24    6                     18
G↔U            9     5                     4
TABLE 4. Numbers of observed alloacceptor and isoacceptor shifts for each tRNA functional class

Alloacceptor shifts
Synthetase class   Identities    No. of shifts
Class I            Val   Ile     1
                   Val   Leu     1
                   Val   Met     1
                   Ile   Gln     1
                   Leu   Met     1
                   Cys   Tyr     1
                   Cys   Arg     1
                   Gln   Arg     1
Class II           Thr   Asn     1
                   Thr   Ala     1
                   Lys   Asn
Transition         Ile   Ser     1
                   Ile   Phe     1
                   Leu   Pro     1
                   Leu   Phe     1
                   Leu   Del
                   Met   Thr
                   Met   Lys
                   Cys   Ser
                   Cys   SeC
                   Tyr   His
                   Glu   Gly
                   Glu   Ala
                   Arg   Gly
                   Arg   His
                   Val   Gly
                   Val   Ala

Isoacceptor shifts
Synthetase class   Identity   No. of codons   No. of shifts
Class I            Val        4               3
                   Leu        6               4
                   Cys        2               3
                   Glu        2               4
                   Gln        2               2
                   Arg        6               4
Class II           Ser        6               3
                   Thr        4               2
                   Pro        4               4
                   Gly        4               4
                   Asn        2               2
                   Ala        4               5
                   Lys        2               1



the anticodon mutation(s). Since prokaryotic aaRS-tRNA 
systems differ considerably from eukaryotic systems and 
since we have detected few anticodon shifts in the bacterial 
data set, the following analyses focus on the eukaryotic data 
sets only. 
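
As an illustration of the covariation measure, the sketch below scores one alignment column against the anticodon. The precise formulation is given in Materials and Methods, which is outside this section, so the sketch assumes that "pair-entropy normalized" means MI divided by the joint entropy H(X,Y); the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def pair_entropy_normalized_mi(anticodons, site_bases):
    """MI between the anticodon and one site, divided by their joint
    entropy (assumed normalization); 0 when nothing varies."""
    h_x = entropy(anticodons)
    h_y = entropy(site_bases)
    h_xy = entropy(list(zip(anticodons, site_bases)))
    if h_xy == 0:
        return 0.0
    return (h_x + h_y - h_xy) / h_xy


# Toy ortholog set: one sequence carries a shifted anticodon.
anticodons = ["GTT", "GTT", "GTT", "CTT"]
columns = {
    "72": ["C", "C", "C", "U"],  # changes together with the anticodon
    "45": ["G", "G", "G", "G"],  # invariant site
}
scores = {pos: pair_entropy_normalized_mi(anticodons, col)
          for pos, col in columns.items()}
summed_mi = sum(scores.values())  # summed score for this anticodon shift
print(scores, summed_mi)
```

In this toy case the site that changes together with the anticodon scores 1 and the invariant site scores 0, so the summed score behaves like a count of covarying positions, as described above.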

Alloacceptor shifts 

The 30 eukaryotic alloacceptor anticodon switches have a
mean summed MI score of 3.9. This varies greatly among
the synthetase classes involved (P = 0.018, ANOVA): Class I
synthetase identity switches have an average MI score of
6.3, and switches among class II identities only 2.2. Switches
between the two synthetase classes (trans identity shifts)
have an average MI of 3.2. The large majority of eukaryotic
alloacceptor shifts (25 out of 30) involve a unique pair of
identities. The alloacceptor shift identity pairs that occur
more than once comprise two Met/Thr shifts and three
Lys/Asn shifts. 

We summed the MI score contributions of all alloacceptor 
shifts at each structural position, each alignment position 
thus having a total MI score (ΣMI) that reflects the strength 
of its covariance with the anticodon over all alloacceptor 
shifts. The data show that there are substantial differences 
in covariance at different positions in the consensus tRNA 
model (Fig. 1). Nine consensus positions and the frequently 
covarying nonconsensus position 37/38 have a summed MI 
score >1 standard deviation (SD=1.19) from the mean 
(1.36) (Fig. 1). The nonconsensus position 37/38 is the site 
of introns in some eukaryotic tRNAs (Marck and Grosjean 
2002), and we see both insertions and substitutions at these 
positions. 
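
The selection of high-scoring positions can be sketched as a simple threshold on the per-position summed scores. The per-position values below are invented for illustration, and the use of the population standard deviation is an assumption, since the section does not say which estimator was used.

```python
import statistics


def high_scoring_positions(summed_mi, n_sd=1.0):
    """Return positions whose summed MI exceeds mean + n_sd * SD."""
    values = list(summed_mi.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD (assumed estimator)
    threshold = mean + n_sd * sd
    return sorted(pos for pos, score in summed_mi.items() if score > threshold)


# Hypothetical summed MI scores; keys are strings so that nonconsensus
# positions such as "37/38" can be represented.
toy_scores = {"12": 3.0, "37": 1.0, "45": 5.0, "50": 1.0, "60": 0.0}
top_sites = high_scoring_positions(toy_scores)
print(top_sites)

# Threshold implied by the values reported in the text (mean 1.36, SD 1.19).
reported_threshold = 1.36 + 1.19
print(reported_threshold)
```

With the reported mean and SD, a position needs a summed MI score above about 2.55 to count as high scoring for alloacceptor shifts.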

The positions that covary most with anticodon mutations 
are located across all domains of the tRNA molecule. Five 
of the 10 top-scoring sites (37, 37/38, 39, 72, and 73) are located
in the anticodon and acceptor domains; these two distal 
ends of the structure are directly involved in the key interface 
functions of the tRNA molecule and are known to be im- 
portant identity determinants for several aaRS-tRNA systems 
(Giege et al. 1998). However, the remaining five positions 
are not known as primary identity elements. In particular, 
the T, D, and variable domains have important structural 




FIGURE 1. Alloacceptor shift covariance heat map showing how extra-anticodon
mutations covary with anticodon changes. Color indicates
the SD from the alloacceptor consensus mean of summed MI (ΣMI)
at each position: dark blue, <−1 SD; blue, −1 SD; green, mean; yellow,
+1 SD; bright red, +2 SD; and dark red, >+2 SD. The nonconsensus
positions 20/21, 27/28, and 47/48 are indicated by arrows. 
roles (Marck and Grosjean 2002) but are typically known to 
harbor minor or partial identity elements that are used in 
only one tRNA system (Giege et al. 1998). These results there- 
fore suggest more complex structural rearrangements that in- 
volve coordinated mutation of other domains of the tRNA 
molecule are required for different aaRS systems or for inter- 
action with the ribosome, for example (Buchan et al. 2006; 
Saks and Conery 2007). 

Perhaps the most striking feature of the alloacceptor shift 
mutations is their heterogeneity. None of the top-scoring 
positions appears heavily biased toward any particular 
pair of identities, or any group of species (for details of 
mutations at these sites, see Table 5). Only one site appears 
biased toward a class of synthetase (position 12, class I), al- 
though the number of shifts (four) is low. The highest- 
scoring position (45) covaries with the anticodon in seven 
different alloacceptor shifts: only 23% of the 30 detected in 
eukaryotes. 

Out of the 30 alloacceptor shifts in eukaryotes, only two 
pairs of identities appear more than once: Asn/Lys three 
times and Met/Thr twice. One of the three Asn/Lys identity 
shifts, occurring in Drosophila, consists of only a GTT > 
TTT anticodon substitution with no extra-anticodon muta- 
tions. tRNA-Lys identity is known to be largely dependent 
on the anticodon in mammalian aaRS-tRNA systems 
(Stello et al. 1999; Francin and Mirande 2006), so no further 
mutations may be required for lysine charging of this tRNA. 
The remaining two Asn/Lys shifts are GTT-to-CTT antico- 
don mutations, both in primates, but are associated with 
different extra-anticodon mutations. The only mutation 
site common to both, position 72, involves substitutions of 
C for different bases: U and G. Similarly the two Met/Thr
shifts, both involving Met(CAT) and Thr(CGT), have different
associated mutations. One occurs in fly and has no 
strongly covarying mutations outside the anticodon substitu- 
tion, and the other is in primates, with five extra-anticodon 
mutations. 

For the alloacceptor shifts with covarying mutations visible 
at major identity determining sites 37, 72, and 73, we com- 
pared these mutations with known eukaryotic identity deter- 
minants reported in the literature (Table 5). Out of nine 
unique shifts, three appear to coevolve according to known 
identity rules. One shift matches a likely identity determinant 
for tRNA-Glu (A72) (Marck and Grosjean 2002). Two shifts 
involving tRNA-Lys and tRNA-Asn do not contradict any 
known identity determinants, since the posterior state 
tRNA-Lys identity is dependent mainly on the anticodon 
(Stello et al. 1999; Francin and Mirande 2006). Out of the re- 
maining six shifts, two have mutations that contradict the 
known identity determinants of one anticodon state: These 
are for tRNA-Tyr (G72) (Lee and RajBhandary 1991) and 
tRNA-Leu (A73 and A37) (Breitschopf and Gross 1994; 
Breitschopf et al. 1995). For the remaining four, we found 
no relevant identity determinant data to support or contra- 
dict coevolution at these sites. 



Isoacceptor shifts 

Of the 41 isoacceptor shifts identified in eukaryotes, we note 
that Glu, Gln, Leu, Lys, Ser, Gly, and Pro each have at least 
one isoacceptor shift without any extra-anticodon mutations 
that strongly covary with the anticodon. However, on aver- 
age, the isoacceptor shifts have approximately the same num- 
ber of extra-anticodon mutations as the alloacceptor shifts 
(average total MI, 4.1 and 3.9, respectively). This is perhaps 
a surprise and indicates the presence of anticodon-dependent 
structural constraints in the tRNA molecule separate from 
those imposed by the aaRSs. 

There are 11 consensus positions that have a total MI score
of >1 SD (1.57) above the mean (2.06), together with a single
nonconsensus position between positions 47 and 48, which is
associated with multiple isoacceptor shifts (Fig. 2). The location
of the sites that covary most with the anticodon is very
different for isoacceptor shifts compared with
alloacceptor shifts (Fig. 2). For isoacceptor shifts, the distal
end of the acceptor stem and the anticodon loop generally
have low MI scores; instead, the top-scoring sites are almost
all located in the middle sections of the tRNA structure, which
in general are not associated with previously described identity
elements. The four highest-scoring sites are located in
loop regions: the D loop (16), the T loop (57 and 59), and the variable loop
(44). Four further high-scoring sites (6, 7, 66, and 67)
are located at the base of the acceptor stem. Only one high-scoring
site is located at the distal end of the molecule: position
32 on the 5' side of the anticodon loop, which is noteworthy
since in alloacceptor shifts covarying mutations in the anticodon
loop and stem tend to be on the opposite 3' side. 

The variable loop has two covarying consensus positions, 
44 and 48. Interestingly, in alloacceptor shifts the adjacent 
position 45 is the predominant site of covarying mutations, 
while position 44 has none. The only nonconsensus position 
that is a common mutation site in the isoacceptor shifts is 
also located in the variable loop. Position 47/48 is mutated 
in five isoacceptor shifts, only one involving an insertion or 
deletion and the remaining four being substitutions (Fig. 
2). The variable domain is thus the only structural region 
that is strongly associated with both alloacceptor and isoac- 
ceptor shifts. 

Like the alloacceptor shifts, the most frequently mutated 
positions among isoacceptor shifts are all heterogeneous in 
tRNA identity (Table 6). They also comprise different synthe- 
tase classes, except position 6, which covaries with the antico- 
don only in class I synthetase tRNAs. There are a few visible 
trends in the covarying mutations when we consider which 
bases are being mutated. For example, although the identities 
involved are different, five different sites have covarying mu- 
tations that are consistent in the bases that are mutated: seven 
isoacceptor shifts contributing to the top-scoring site 57 un- 
dergo A/G substitutions; five out of six mutations at position 
66 (hybridized site) are U/C substitutions; all five substitu- 
tions at position 7 (hybridized site) are between A and G 
TABLE 5. Mutations covarying with alloacceptor anticodon shifts at high-scoring and nonconsensus sites

Site  Order    Anticodon  Bases     Anticodon  Bases  Synthetase class  Known eukaryotic identity determinants
45    Worm     Ile_TAT    11U       Gln_TTG    1gap
      Fly      Asn_GTT    11G       →Thr_GGT   1A     II
      Primate  Asn_GTT    1A 9G     →Lys_CTT   1A     II
      Primate  Gly_TCC    5G        →Glu_TTC   1gap   II→I
      Primate  Gly_CCC    1gap 3G   →Val_CAC   1C     II→I
      Primate  Cys_GCA    2A        SeC_TCA    1G     I/II
      Primate  Met_CAT    1G        Leu_CAA    1gap
72    Primate  Asn_GTT    10C       →Lys_CTT   1U     II
      Primate  Cys_GCA    1U 13C    →Tyr_GTA   1U                       Tyr: C1:G72
      Primate  Gly_TCC    5C        →Glu_TTC   1A     II→I              Glu: A72
      Primate  Met_CAT    1C        Thr_CGT    1G     I/II              Thr: G1:C72
      Primate  Asn_GTT              →Lys_CTT          II
12    Worm     Val_CAC    5G        →Met_CAT   1gap   I
      Primate  Ile_TAT    1G        Val_TAC    1U     I
      Primate  Gly_TCC    5A        →Glu_TTC   1G     II→I
      Primate  Leu_TAA    1U 3C     →Val_TAC   1U     I
41    Fly      Asn_GTT    11G       →Thr_GGT   1C     II
      Primate  Met_CAT              Leu_CAA
      Primate  Met_CAT    1C        Thr_CGT    1U     I/II
51    Primate  Cys_GCA    2C        SeC_TCA           I/II
      Primate  Cys_GCA    6C        Arg_GCG    3U
      Primate  Gly_CCC    24G       →Arg_CCG   1gap   II→I
73    Primate  Arg_TCG    3G        Gln_TTG    1A     I
      Primate  Phe_GAA    4A        Leu_CAA    1G     I/II              Leu: A73, Phe: A73
      Primate  Met_CAT    1A        Thr_CGT           I/II              Met: A73
39    Worm     Val_CAC    5G        →Met_CAT   1A
      Primate  Ala_AGC    4G        →Val_AAC   6U     II→I
      Primate  His_GTG    12C       →Arg_GCG   1U     II→I

37 


Worm 


lleTAT 


1 1 G 


Gln_TTG 


1 g a P 








Fly 


Tyr_GTA 


1 1 G 


-»His_GTG 


1 A 


l->ll 






Pri mate 


Pho CAA 


-+ V i 


1 gi i C A A 


i ^ 


i/ii 
i/n 


1 oi i- A T 7^ 

Leu. aj/ 


26 


Primate 


Arg_TCG 


3G 


Gln_TTG 


1 A 








Primate 


MeLCAT 


1G 


Leu_CAA 


1A 








Primate 


Asn_GTT 


10G 


-»Lys_CTT 


1A 


II 




37/38 


Fly 


Tyr_GTA 


11a 


-»His_GTG 


1 u 


l^ll 






Primate 


lle_TAT 


1a 


VaLTAC 


1 c 








Primate 


Met_CAT 


1a 


Leu_CAA 


1gap 








Primate 


Met_CAT 


1a 


Thr_CGT 


1gap 


l/ll 




47/48 


Primate 


Gly.TCC 


5gap 


^GIuJTC 


1 a 


ll-»l 






Primate 


Phe.GAA 


4u 


lle_GAT 


1gap 


l/ll 






Primate 


Gly_CCC 


1 a 23gap 


^Arg_CCG 


1 g 


IM 




20/21 


Primate 


Arg_TCG 


3u 


Gln_TTG 


1 gap 








Primate 


Phe^GAA 


4gap 


Leu_CAA 


1 g 


l/ll 





Where directionality can be assigned, the resulting gene is indicated by an arrow. 37/38, 47/48, and 20/21 are nonconsensus positions. 
Known identity determinants for relevant tRNAs at positions 37, 72, and 73 are listed. Multiply occurring tRNA identities are indicated by 
bold type. 

a Fechter et al. (2000). 
b Marck and Grosjean (2002). 
c Nameki (1995). 
d Breitschopf and Gross (1 994). 
e Nazarenko et al. (1992). 
'Senger et al. (1992, 1995). 



bases; and position 48 mutations all involve C, while those at the
adjacent nonconsensus position 47/48 all involve U residues.
These examples suggest that evolution at these sites is con- 
strained in a consistent manner among the different tRNA 
identities. 



DISCUSSION 

The tRNA multigene family comprises very short genes, 
which can be very highly conserved and yet are subject to 
high turnover (Bermudez-Santana et al. 2010; Rogers et al. 



FIGURE 2. Isoacceptor shift covariance heat map showing how extra-anticodon
mutations covary with anticodon changes. Color indicates
the SD from the alloacceptor consensus mean of summed MI (ΣMI)
at each position: dark blue, < −1 SD; blue, −1 SD; green, mean; yellow,
+1 SD; bright red, +2 SD; and dark red, > +2 SD. The nonconsensus
position 47/48 is indicated by an arrow.



2010). Using sequence similarity methods, it is hard to justify 
a similarity threshold that both includes diverging orthologs 
and discards nonorthologous sequences, which may only dif- 
fer by a few nucleotides. A purely phylogenetic approach is 
unlikely to reliably identify the close orthologs of a tRNA 
gene that has undergone an anticodon shift, since it may 
have accumulated several additional mutations, some of 
which may be convergent to the sequence patterns of another 
functional class. We therefore used a synteny-mapping approach to
find sets of tRNA orthologs for the purpose of describing
putative anticodon shifts.

Our analysis relies on the accurate prediction of tRNA 
genes and their identities. We use the tool tRNAscan-SE 
(Lowe and Eddy 1997), which has been shown to be highly 
sensitive and specific. However, we cannot rule out the possibility
that some predicted tRNA genes are pseudogenes that are
not actively transcribed and are thus free to accumulate mutations.
Our method nevertheless requires that tRNA genes have
extant orthologs and therefore includes only tRNAs that are
syntenically conserved in at least one other genome. While
we expect that most conserved tRNA genes are likely to be 
transcribed and functional, we further attempted to mini- 
mize the effect on our analyses of such potential "inactive" 
tRNA genes by restricting inclusion in the ortholog sets to 
genes that were annotated with high scores, indicating a 
high degree of conformity with the consensus structure. 

We have detected tRNA anticodon shifts in all taxonomic 
groups tested: primates, flies, worms, yeast, and bacteria. We 
find the largest number of anticodon shifts between five pri- 
mate species (31; 41% of the total set of shifts identified), 



while only four were detected in the large group of 61 
Enterobacteriaceae. The number of shifts detected in each 
group is correlated with the average size of that group's 
mapped tRNA gene complement. This suggests that anticodon
mutations are facilitated by high redundancy of
tRNA genes in a genome. In species with high tRNA gene
copy numbers, loss of one tRNA gene may have relatively lit- 
tle impact on overall tRNA population in the cell for that 
tRNA functional type. 

Coevolution in alloacceptor and isoacceptor 
anticodon shifts 

In contrast with the random expectation, we detect more iso- 
acceptor shifts than alloacceptor shifts in all species groups. 
This implies greater constraint on alloacceptor shifts than 
isoacceptor shifts, as might be expected. However, we find lit- 
tle difference between the categories in their average numbers 
of covarying mutations outside the anticodon, as quantified 
by MI (average ΣMI: 3.9 and 4.1, respectively). Mutations
that coevolve with alloacceptor anticodon shifts are expected 
to ensure that the amino acid identity matches the new anti- 
codon. Indeed, many observed mutations do match those 
sites commonly associated with tRNA identity determinants. 
However, extra-anticodon mutations associated with isoacceptor
shifts are harder to rationalize and have not been
previously described. We suggest that these changes may, 
at least in part, reflect particular properties of mutations 
to the first (5') anticodon base, which is the site of nearly 
all (87%) isoacceptor shift anticodon mutations. Consider- 
ing that groups of isoacceptor tRNAs can contain different 
antideterminants encoded by distinct sequence elements 
(Fender et al. 2004) and that the anticodon has a prominent 
place in most identity systems, we speculate that coordinate 
interactions between the 5' anticodon base and the distinct 
identity elements of isoacceptors may drive the coevolution 
of both. 

In recent years, a handful of studies have addressed the 
functional relevance of tRNA sequence variations in transla- 
tion, independent of the tRNA's role in aminoacylation. 
Geslain and Pan (2010) showed that tRNAs with the same an- 
ticodon but different sequences (known as isodecoders), 
which are particularly common in mammalian genomes, 
have similar stability and aminoacylation activity but differ- 
ent stop-codon suppression activity and, thus, translational 
efficiency. By examining codon pair choices in open reading 
frames, Buchan et al. (2006) found evidence that nucleotide 
interactions between the tRNAs occupying ribosome A-site 
and P-site compartments influenced codon pair preference 
and translational performance, thus suggesting tRNA se- 
quence elements, including, but not limited to, the antico- 
don, may be fine-tuned to alter ribosomal interactions. 
Saks and Conery (2007), studying the sequence conservation 
of bacterial tRNA genes grouped according to anticodon trip- 
lets, found that several bases were conserved in an anticodon- 

TABLE 6. Mutations covarying with isoacceptor anticodon shifts at high-scoring and nonconsensus mutation sites

Site   Order    Anticodon  Bases        Anticodon    Bases        Synthetase class
57     Worm     Pro_TGG    3A           Pro_CGG      1G           II
       Worm     Ala_CGC    1A           Ala_TGC      [illegible]  II
       Fly      Cys_GCA    6A           ->Cys_ACA    1G           I
       Fly      Arg_TCG    11G          ->Arg_TCT    1A           I
       Primate  Asn_GTT    6A 52G       ->Asn_ATT    [illegible]  II
       Yeast    Pro_AGG    1G           ->Pro_TGG    6A           II
       Yeast    Val_CAC    3G           ->Val_TAC    1A           I
16     Worm     Ala_AGC    2U           Ala_CGC      1C           II
       Worm     Pro_TGG    3U           Pro_CGG      1C           II
       Worm     Leu_AAG    1U 4C        Leu_TAG      6U           I
       Fly      Cys_GCA    6G           ->Cys_ACA    1A           I
       Fly      Ser_AGA    16C          ->Ser_CGA    1U           II
       Primate  Cys_GCA    1U 1G        Cys_ACA      1C           I
       Primate  Thr_TGT    1G           Thr_CGT      1C 1U        II
       Yeast    Pro_AGG    1A           ->Pro_TGG    [illegible]  II
44     Worm     Glu_CTC    4A           Glu_TTC      [illegible]  I
       Fly      Gly_GCC    10C          ->Gly_CCC    2U           II
       Fly      Gly_GCC    9C           Gly_CCC      3U           II
       Fly      Arg_TCG    11A          ->Arg_TCT    1C           I
       Primate  Asn_GTT    58A          ->Asn_ATT    [illegible]  II
       Primate  Leu_CAG    5U           ->Leu_CAA    2C           I
59     Worm     Val_AAC    1U 2G        Val_TAC      2U           I
       Worm     Pro_TGG    5U           Pro_CGG      3A           II
       Fly      Arg_ACG    31C          ->Arg_TCG    1C           I
       Primate  Cys_GCA    2A           Cys_ACA      1C           I
       Yeast    Pro_AGG    1G           ->Pro_TGG    [illegible]  II
       Yeast    Val_CAC    3A           ->Val_TAC    1U           I
66     Worm     Val_TAC    5U 3C        Val_CAC      9C           I
       Fly      Gly_GCC    10C          ->Gly_CCC    2U           II
       Fly      Arg_CCT    13C          ->Arg_TCT    1U           I
       Primate  Thr_TGT    1G           Thr_CGT      2U           II
       Yeast    Glu_TTC    11U          Glu_CTC      [illegible]  I
       Yeast    Ala_AGC    7C           Ala_TGC      [illegible]  II
48     Fly      Gly_GCC    10C          ->Gly_CCC    2U           II
       Fly      Gly_GCC    9C           Gly_CCC      3U           II
       Primate  Ala_CGC    5C           ->Ala_TGC    1U           II
       Primate  Cys_GCA    2C           Cys_ACA      [illegible]  I
67     Worm     Ala_AGC    2G           Ala_CGC      1A           II
       Fly      Ser_AGA    16G          ->Ser_CGA    1A           II
       Yeast    Glu_TTC    11A          Glu_CTC      [illegible]  I
       Yeast    Ala_AGC    7U           Ala_TGC      [illegible]  II
6      Worm     Ala_AGC    2C           Ala_CGC      1U           II
       Worm     Pro_TGG    5A           Pro_CGG      [illegible]  II
       Fly      Ser_AGA    16C          ->Ser_CGA    1U           II
       Yeast    Ala_AGC    7U           Ala_TGC      8C           II
32     Worm     Glu_CTC    4C           Glu_TTC      [illegible]  I
       Fly      Arg_TCT    3U           Arg_TCG      1C           I
       Yeast    Glu_TTC    11C          Glu_CTC      [illegible]  I
       Yeast    Ala_AGC    7U           Ala_TGC      [illegible]  II
7      Worm     Val_TAC    5A 3G        Val_CAC      [illegible]  I
       Fly      Gly_GCC    [illegible]  [illegible]  2A           II
       Fly      Cys_GCA    6A           ->Cys_ACA    1G           I
       Yeast    Glu_TTC    11A          Glu_CTC      4G           I
       Yeast    Ala_AGC    7G           Ala_TGC      7A 1G        II
64     Worm     Val_AAC    3C           Val_TAC      1C 1G        I
       Primate  Asn_GTT    10C          ->Asn_ATT    1U           II
       Primate  Thr_TGT    1G           Thr_CGT      2A           II
       Yeast    Pro_AGG    1G           ->Pro_TGG    6A           II
       Yeast    Glu_TTC    22C 2G       ->Glu_CTC    3G           I

(continued)



dependent manner rather than an identity-dependent
manner. Such residues include
the 32-38 pseudo-base pair at the top
of the anticodon loop and residue 37, adjacent
to the third anticodon nucleotide.
In line with Buchan et al. (2006), the in- 
vestigators suggest that structural interac- 
tions between the anticodon and other 
regions of the tRNA are important mod- 
ulators of translational efficiency at the 
ribosome. 

Since both aaRS interaction and ribo- 
somal interaction constrain tRNA struc- 
ture in a manner dependent on the 
anticodon, we expect mutations that co- 
evolve with alloacceptor shift mutations 
to affect both identity elements and ribo- 
somal interaction sites. In contrast, sites 
that covary with isoacceptor shifts should 
affect only the ribosomal interaction sites. 
Indeed, we find some agreement between
the anticodon-dependent conservation
data of Saks and Conery (2007) and the
top covarying sites of the anticodon shifts,
such as position 37 in the alloacceptor
shifts and positions 32 and 44 in the isoacceptor
shifts; however, most of the key sites of covariance
in both categories are not
those identified as conserved in an anticodon-dependent
manner.

Frequently covarying sites 

While the top-scoring covarying muta- 
tion sites for both alloacceptor and isoac- 
ceptor shifts are spread out over the tRNA 
molecule, there are notable differences 
between the two classes. Among the 
most frequently covarying sites in alloac- 
ceptor shifts are the tip of the acceptor 
stem (bases 72 and 73) and the 3' side of 
the anticodon loop (37, 37/38, and 39); 
these distal bases of the tRNA structure 
have been previously described as key 
identity determinants in a wide range of 
systems (Giege et al. 1998). In isoacceptor 
shifts, the same regions show very little 
covariance with anticodon substitutions. 
Thus, specific changes in the distal ends 
of the tRNA molecule are important 
and relatively common in the evolution 
of alloacceptor (but not isoacceptor) anti- 
codon shifts. The only region that is 
strongly associated with covariation in 
both categories of anticodon shift is the 

TABLE 6. Continued

Site   Order    Anticodon  Bases        Anticodon    Bases        Synthetase class
47/48  Worm     Ala_AGC    2G           Ala_CGC      1U           II
       Primate  Cys_GCA    2U           Cys_ACA      1A           I
       Primate  Asn_GTT    1C 56U 1G    ->Asn_ATT    2U 4gap      II
       Yeast    Pro_AGG    1U           ->Pro_TGG    6C           II
       Yeast    Val_CAC    3U           ->Val_TAC    1C           I

Where directionality can be assigned, the resulting gene is indicated by an arrow. 47/48 is
a nonconsensus position. Multiply occurring tRNA identities and homogeneity in mutations
are indicated by bold type.



variable loop; however, the precise location of mutations is 
different: base 45 in alloacceptor shifts and 44, 47/48, and 
48 in isoacceptor shifts. The variable loop is important in de- 
termining tRNA-Ser (and thus selenocysteine) identity in 
both humans (Achsel and Gross 1993; Breitschopf and 
Gross 1994) and yeast (Himeno et al. 1997), yet there is 
only one anticodon shift involving a tRNA-Ser/SeC anticodon.
The base pair of residues 27 and 43 is known to be involved
in codon recognition (Yarus et al. 2003), and the
base pair of residues 26 and 44 has been shown to be conserved
according to the anticodon (Saks and Conery 2007). Thus we
speculate that the variable loop mutations may be involved in
anticodon-dependent structural rearrangements that facilitate
codon recognition, rather than in aaRS interactions.

Interestingly, the sites in and around the anticodon loop 
that covary with alloacceptor anticodon mutations are all lo- 
cated on the 3' side. When bound in the ribosome A-site, 
these nucleotides (37-39) face the closely adjacent P-site 
tRNA anticodon (Thermus thermophilus crystal structure:
PDB accession nos. 1GIX and 1GIY) (Yusupov et al. 2001). 
Previous studies have found that chemical modifications of 
A-site tRNA nucleotides at these locations are important 
for reading frame maintenance and aminoacyl-tRNA selec- 
tion (Li et al. 1997; Urbonavicius et al. 2001, 2003). In their 
analysis of codon pair biases, Buchan et al. (2006) suggested 
that nucleotide identity and chemical modifications of these 
nucleotides 3' of the anticodon loop have a role in translation 
optimization, dependent on codon-pair usage. If the role of 
these nucleotides in translation optimization is related to 
the identities specified by codon pairs, it may explain why an- 
ticodon substitutions to the nucleotides 3' of the anticodon 
loop are common during identity (alloacceptor) shifts, but 
not isoacceptor shifts. 

Many mutation pathways to identity change 

A few covarying sites show preferences for certain synthetase 
classes and particular base substitutions. However, in general 
both alloacceptor and isoacceptor shifts show considerable 
heterogeneity in the extra-anticodon mutations that covary 
with their anticodon mutations. Many anticodon shifts in- 
volving the same transition of identities and anticodon 



seem to have very different mutations 
outside the anticodon. We also found at 
least two instances of extra-anticodon 
mutations that contradict known identi- 
ty determinants. However, we note that 
for some eukaryotic tRNAs (Arg, Ile,
and Lys), the acceptor stem is known 
not to contain major identity determi- 
nants (Senger et al. 1997; Delagoutte 
et al. 2000; Francin and Mirande 2006). 
In these tRNAs, as well as those for Asp, 
Leu, Met, Phe, and Ser, identity is thought 
to be controlled to greater or lesser ex- 
tents by more complex structural features in other parts of 
the molecule (Sampson et al. 1989; Putz et al. 1991; Naza- 
renko et al. 1992; Senger et al. 1992, 1995; Achsel and Gross 
1993; Breitschopf and Gross 1994; Breitschopf et al. 1995; 
Aphasizhev et al. 1997; Himeno et al. 1997). Furthermore, 
identity determinants for Asn and Gln tRNAs have not been
identified in eukaryotes. Our knowledge of identity determi- 
nants is limited by the technical nature of experimental deter- 
mination, which generally involves substituting particular 
residues, especially those in the acceptor stem, and swapping 
domains of different tRNAs. It is likely that identity rules 
vary in eukaryotes, as they are known to in bacteria 
(Freyhult et al. 2007). Thus, our results suggest that tRNA 
identity systems are governed by more complex and subtle 
rules than is currently appreciated. We base these conclusions 
on (1) the ubiquity and frequency of detected anticodon 
shifts, (2) the low average numbers of covarying mutations, 
(3) the conflicts with known eukaryotic identity rules report- 
ed, and (4) the dissimilar patterns of covarying mutations 
observed for the alloacceptor and isoacceptor shifts involving 
the same identities. 

According to the "ambiguous identity" hypothesis pro- 
posed by Ardell and Andersson (2006), new identity rules 
may evolve through spontaneous mutations that create and 
pass through an intermediate state of ambiguous tRNA 
charging or translational efficiency. We speculate that 
the tRNA anticodon shifts we describe represent different 
stages of this process: Some genes we describe may be in an 
ambiguous state, while others have switched to exclusively 
charge the new anticodon-specified amino acid. We have 
also shown that alloacceptor anticodon shifts that involve a 
transition between aaRS classes are common. Class I and II 
aaRSs approach the tRNA molecule from different sides 
(Lapointe and Brakier-Gingras 2003). We can therefore ex- 
tend the ambiguous identity hypothesis to include the possi- 
bility that both synthetases could bind to the same tRNA at 
the same time, facilitating the evolution of one state to the 
other. In general, the potential for a tRNA gene to undergo 
an identity switch must depend on a variety of interrelated 
factors: (1) both identity systems involved, including their 
determinants and anti-determinants; (2) the ribosome-inter- 
acting nucleotides of the tRNA, including the anticodon; and 
(3) the levels of genomic redundancy, including the numbers
of isoacceptors in each family. Despite these complex and 
interdependent constraints, our findings suggest that the bar- 
riers to switching identity states are frequently surmounted. 

MATERIALS AND METHODS 

Mappings and ortholog set compilation 

We used tRNAscan-SE (Lowe and Eddy 1997) in eukaryotic or
bacterial mode, as appropriate, to annotate tRNAs in a total of 34
eukaryotic species from four different taxa (five primates, six nematode
worms, 11 fungi, and 12 Drosophila flies) and in 61 bacterial species,
each from a different genus of the Enterobacteriaceae family. The list
of species and genome assembly versions is provided in Supple- 
mental File 1, and we provide GFF lists of our tRNA annotations 
in Supplemental File 2. We used a flank-mapping method to detect 
regions of microsynteny conservation, as previously described 
(Rogers et al. 2010). Briefly, we extracted genomic flanking regions 
surrounding each tRNA. We masked all tRNAs in these genomic re- 
gions, together with all repeats using RepeatMasker (Smit et al. 
1996-2010) and the 2011-09-20 edition repeat libraries (Jurka 
et al. 2005). We then used WU-BLASTN 2.0MP (Gish 1996- 
2004) to search the tRNA flanking regions against the tRNA flanking 
regions from all other genomes of that group, using the following 
parameters: word size, 7; E-value threshold, 10^-6; and the
hspsepSmax and hspsepQmax parameters, 50 bases. A range of
flanking sequence lengths was tested for each taxonomic group;
the length used for the analysis was the length at which an increased
size greatly increased the number of mappings. These lengths were
5 kb upstream and downstream in primates, nematodes, and flies 
and 1 kb upstream and downstream in Saccharomycetes and 
Enterobacteriaceae. The tRNAs in each taxonomic group with flanking
sequence BLAST hits with E-values <10^-6 were assembled into
sets of orthologs by single linkage clustering. Since the aim is to 
detect anticodon shifts rather than to analyze rates of tRNA duplica- 
tion and deletion, unlike in our previous work (Rogers et al. 2010), 
we use a "relaxed" definition of the ortholog set concept, which po- 
tentially includes closely related in- and out-paralogs. 
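
The single linkage clustering step can be illustrated with a short sketch. This is not the pipeline used in the study; it assumes that the significant flank-to-flank BLAST hits have already been reduced to pairs of tRNA gene identifiers, and the gene names and the function name `single_linkage_clusters` are hypothetical.

```python
from collections import defaultdict


def single_linkage_clusters(genes, hits):
    """Group genes into sets: any two genes joined by a chain of hits share a set."""
    parent = {g: g for g in genes}

    def find(g):
        # Walk up to the set representative, shortening the path on the way.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two sets

    clusters = defaultdict(set)
    for g in genes:
        clusters[find(g)].add(g)
    # Sort for a reproducible result.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical tRNA gene identifiers and pairwise flank hits (E-value below threshold).
genes = ["hsa_1", "ptr_1", "mml_1", "hsa_2", "ptr_2", "hsa_3"]
hits = [("hsa_1", "ptr_1"), ("ptr_1", "mml_1"), ("hsa_2", "ptr_2")]
ortholog_sets = single_linkage_clusters(genes, hits)
```

Because linkage is transitive, `hsa_1` and `mml_1` end up in the same set even though no direct hit joins them, which is the sense in which the ortholog set definition is "relaxed".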

Anticodon shift detection and validation 

We selected all ortholog sets containing more than one anticodon as 
potential anticodon shifts. We used the following criteria to further 
conservatively filter the data set: 

1. We discarded anticodon shifts involving tRNAs annotated by 
tRNAscan-SE with a score of <50 bits. This threshold was deter- 
mined by manual inspection of alignments of the low-scoring 
tRNAs involved in anticodon shifts: scores <50 bits were often 
characterized by structures with large truncations or insertions 
in usually nonvariable regions and by several substitutions that 
do not conserve base-pairings in stem regions. Thirty-four po- 
tential anticodon shifts were discarded at this step, 22 of them 
from primates. 

2. Anticodon shifts involving sequence pairs that are mutually 
widespread across species' tRNA complements are difficult to 
prove authentic, rather than the result of spurious mappings. 



We therefore discarded potential shifts that were inferred by 
Dollo parsimony (Farris 1977) to have arisen before divergence 
of the taxon group, based on the presence and absence of 
tRNAs of both anticodons in the various species, and topology 
of the genus tree. In cases where lack of mappings made this 
test uninformative, we searched a representative gene of the 
shifting tRNA type against the Transfer RNA Database (Jühling

et al. 2009). If the shifting genes resulted in identical or nearly 
identical matches outside the taxon group, the anticodon shift 
was discarded. 

3. We aligned the tRNA sequences of each ortholog set using 
ClustalW 2.0.12 (Thompson et al. 1994) and viewed the alignment
in Jalview (Clamp et al. 2004). We counted the number
of mutations between the sequences of tRNAs with the potential 
anticodon shift. Sequence pairs with more than 10 mutations 
were discarded as potential false ortholog calls. 

4. For potential alloacceptor shifts only, the following criterion was 
used to override discard according to tests 2 and 3. We searched a 
representative gene of each identity against Tfam 1.3 (Taquist 
et al. 2007), which classifies tRNA genes based on sequence mo- 
tifs outside the anticodon using known identity rules. If the shift- 
ing identity tRNA is predicted to have the same identity as its 
putative ortholog, rather than the identity specified by its own 
anticodon, this is taken as a confirmation of the shift and it is 
retained. 
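
Criteria 1 and 3 are simple numeric thresholds and can be sketched as follows. The two thresholds (50 bits and 10 mutations) come from the text; everything else is an assumption made for illustration: the function names are hypothetical, the sequences are taken to be already aligned, and every differing column, including a gap against a base, is counted as one mutation. Criteria 2 and 4 (Dollo parsimony and the Tfam override) are not modeled here.

```python
MIN_SCORE_BITS = 50.0  # criterion 1: tRNAscan-SE score threshold from the text
MAX_MUTATIONS = 10     # criterion 3: maximum mutations between a sequence pair


def count_mutations(aligned_a, aligned_b):
    """Count alignment columns at which two aligned sequences differ.

    Assumption: a gap opposite a base counts as one mutation.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)


def passes_filters(score_a, score_b, aligned_a, aligned_b):
    """Return True if a candidate shift survives criteria 1 and 3."""
    if min(score_a, score_b) < MIN_SCORE_BITS:
        return False  # criterion 1: a low-scoring tRNA is involved
    return count_mutations(aligned_a, aligned_b) <= MAX_MUTATIONS


# Hypothetical aligned fragments differing at two columns.
n_diff = count_mutations("GCAUU-G", "GCCUUAG")
kept = passes_filters(72.1, 65.0, "GCAUU-G", "GCCUUAG")
dropped = passes_filters(72.1, 49.9, "GCAUU-G", "GCCUUAG")
```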

Coevolution analysis 

In order to find which sites were coevolving with the mutations in 
the anticodon, we generated alignments of tRNAs for each validated 
anticodon shift by aligning the sequences against tRNAscan-SE's eu- 
karyotic covariance model (Lowe and Eddy 1997) using INFERNAL 
1.0.2 (Nawrocki et al. 2009). Alignment columns were numbered 
according to the Sprinzl et al. (1991) consensus tRNA numbering. 
Insertions with respect to the covariation model consensus were 
numbered x.01, x.02, etc. 
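
The insertion numbering scheme can be sketched as follows. This is an illustration only: it assumes the alignment is described by a per-column flag saying whether the column is a consensus column, together with the list of consensus position labels in order. The function name `label_columns` and the label `"0"` for insertions preceding the first consensus column are hypothetical.

```python
def label_columns(column_is_consensus, consensus_labels):
    """Label alignment columns; insertions after position x become x.01, x.02, ..."""
    labels = []
    remaining = iter(consensus_labels)
    current = "0"   # assumed label for insertions before the first consensus column
    n_insert = 0
    for is_consensus in column_is_consensus:
        if is_consensus:
            current = next(remaining)
            n_insert = 0
            labels.append(current)
        else:
            n_insert += 1
            labels.append(f"{current}.{n_insert:02d}")
    return labels


# Two inserted columns between consensus positions 47 and 48.
labels = label_columns([True, True, False, False, True], ["46", "47", "48"])
```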

For each alignment, we assessed the coevolution of the antico- 
don with other sites in the tRNA sequence using the quantity 
MI (Chiu and Kolodziejczak 1991; Gutell et al. 1992). The MI be- 
tween two columns reflects the degree to which the pattern in the 
two columns is correlated. If bases occur independently at the two 
sites, the theoretical value for MI is zero. We used standard MI 
normalized by the joint entropy of the two positions, H(X,Y). The effect
of this normalization is to make full dependency between
the two sites score 1, regardless of the relative numbers of sequenc- 
es of each tRNA type present. Normalization by pair entropy has 
been shown to perform best out of several variations of MI in 
detection of coevolving positions in protein sequences (Martin 
et al. 2005). 
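
A minimal sketch of the normalized score for a pair of alignment columns is given below, computed as MI(X;Y)/H(X,Y) with MI(X;Y) = H(X) + H(Y) - H(X,Y). The function names are hypothetical, base-2 logarithms are an assumption (the ratio does not depend on the base), and gap characters are simply treated as another symbol.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of a sequence of symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def normalized_mi(col_x, col_y):
    """Mutual information of two alignment columns divided by their joint entropy."""
    if len(col_x) != len(col_y):
        raise ValueError("columns must contain one symbol per sequence")
    h_x = entropy(col_x)
    h_y = entropy(col_y)
    h_xy = entropy(list(zip(col_x, col_y)))
    if h_xy == 0:
        return 0.0  # both columns invariant: no information to share
    return (h_x + h_y - h_xy) / h_xy


# Fully dependent columns with a 3:1 imbalance between the two tRNA types.
dependent = normalized_mi("GGGA", "CCCU")
# Columns whose bases occur independently of each other.
independent = normalized_mi("AAGG", "CUCU")
```

In the first example the score is 1 even though one tRNA type contributes three sequences and the other only one, which is the property the normalization is intended to provide; in the second, the score is 0.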

For each extra-anticodon alignment position with a site entropy 
H(X) greater than zero, we calculated the MI with respect to each 
of the three anticodon positions; the highest score of the three was 
taken as the site's MI with respect to the anticodon. The method 
treats consensus positions and nonconsensus positions uniformly, 
so substitutions, insertions, and deletions are all potentially detect- 
able as covarying mutations. We summed the scores at each position 
for all alloacceptor shifts and isoacceptor group shifts, as well as 
alloacceptor shifts within both synthetase classes and transitions be- 
tween classes I and II. Structure "heat maps" for each class of shift 
were drawn using VARNA (Darty et al. 2009). 
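
The per-site scoring and summation described above can be sketched as follows, assuming the normalized MI of each variable site against each of the three anticodon positions (34, 35, and 36 in the standard tRNA numbering) has already been computed for every shift. The data layout, function names, and values are hypothetical.

```python
from collections import defaultdict


def site_scores(mi_table):
    """For one shift: keep, per site, the highest MI over the three anticodon positions."""
    return {site: max(scores) for site, scores in mi_table.items()}


def summed_mi(shifts):
    """Sum the per-site scores over all shifts in a category."""
    totals = defaultdict(float)
    for mi_table in shifts:
        for site, score in site_scores(mi_table).items():
            totals[site] += score
    return dict(totals)


# Hypothetical values: site label -> (MI vs. position 34, 35, 36) for each shift.
alloacceptor_shifts = [
    {"73": (1.0, 0.0, 0.25), "37": (0.5, 0.5, 0.0)},
    {"73": (0.0, 0.75, 0.0)},
]
totals = summed_mi(alloacceptor_shifts)
```

Sites with zero entropy in a given alignment are simply absent from that shift's table, so they contribute nothing to the sum.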